Microblog Search Task at CLEF 2017: Query Generation using IR and LDA Topic Modeling Combination

Authors

  • Malek Hajjem
  • Cherif Chiraz Latiri
Abstract

The microblog search task at CLEF 2017 consists of developing a system that retrieves the most relevant microblogs for a cultural query from a collection of microblogs about festivals in all languages. Our general approach is as follows: from the initial tweet queries provided for the task, we generate extended queries able to retrieve an answer-rich set of microblogs. This is achieved using a thematic representation of the tweet query extracted from the microblog corpus. In this paper we investigate a novel method to improve the topics learned from Twitter content without modifying the basic machinery of LDA. The method relies on an Information Retrieval (IR) process to generate a query-specific set of similar tweets. This set then serves as the input of a basic LDA topic modeling process. Finally, the output thematic cluster serves as our source of expansion for the initial queries.

Keywords: CLEF, Microblogs Search, LDA, Information Retrieval, Aggregation
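The three stages described in the abstract (IR retrieval of similar tweets, LDA over the retrieved set, expansion from the resulting thematic cluster) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the TF-IDF cosine ranking, the small collapsed Gibbs sampler standing in for "basic LDA", the rule that picks the topic holding the most query-term mass, the toy tweets, and all function and parameter names.

```python
import math
import random
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer that drops very short tokens."""
    return [w for w in text.lower().split() if len(w) > 2]


def retrieve(query, tweets, k):
    """IR step (assumed model): rank tweets by TF-IDF cosine similarity."""
    docs = [Counter(tokenize(t)) for t in tweets]
    df = Counter(w for d in docs for w in d)
    idf = {w: math.log(1 + len(docs) / df[w]) for w in df}

    def vec(counts):
        return {w: c * idf[w] for w, c in counts.items() if w in idf}

    def cosine(a, b):
        dot = sum(a[w] * b.get(w, 0.0) for w in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vec(Counter(tokenize(query)))
    scored = sorted(((cosine(q, vec(d)), i) for i, d in enumerate(docs)),
                    reverse=True)
    # Keep only tweets that share at least one term with the query.
    return [tweets[i] for score, i in scored[:k] if score > 0]


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Unmodified LDA via collapsed Gibbs sampling; returns topic-word counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    doc_topic = [[0] * n_topics for _ in docs]

    def update(d, w, z, delta):
        topic_word[z][w] += delta
        topic_total[z] += delta
        doc_topic[d][z] += delta

    # Random initial topic assignment for every token.
    assign = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, z in zip(doc, assign[d]):
            update(d, w, z, +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assign[d][i], -1)
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                update(d, w, z, +1)
    return topic_word


def expand_query(query, tweets, k=4, n_topics=2, n_terms=3):
    """Retrieve similar tweets, fit LDA on them, expand from one topic."""
    similar = retrieve(query, tweets, k)
    topics = lda_gibbs([tokenize(t) for t in similar], n_topics)
    q_terms = tokenize(query)
    # Assumed selection rule: the topic holding the most query-term mass.
    best = max(topics, key=lambda tw: sum(tw[w] for w in q_terms))
    extra = [w for w, c in best.most_common() if c > 0 and w not in q_terms]
    return q_terms + extra[:n_terms], similar


# Toy data, invented for this example.
tweets = [
    "jazz festival montreux tonight amazing concert",
    "montreux jazz concert tickets festival lineup",
    "festival jazz lineup announced montreux stage",
    "football match tonight stadium goal",
    "new phone battery review camera",
    "cannes film festival red carpet premiere",
]
expanded, similar = expand_query("jazz festival montreux", tweets)
print(expanded)
```

Because LDA only ever sees the query-specific set returned by `retrieve`, expansion terms can come only from tweets that already overlap with the query, which is the point of placing the IR step in front of an otherwise unchanged topic model.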

Similar resources

Re-Ranking Microblogs Using Word2Vec in Microblog Search Task, University of Avignon

This working note proposes an improved system built on the Indri search engine for the Microblog Search Task of the MC2 CLEF 2017 lab. The improvement relies on a Word2Vec model that re-ranks the results by comparing them with the query.

Comparative Evaluation of Query Expansion Methods for Enhanced Search on Microblog Data: DCU ADAPT @ SMERP 2017 Workshop Data Challenge

The rapid growth in the availability of social media content posted during emergency situations is creating significant interest in research into how this information can be exploited to assist emergency relief operations and to help with emergency preparedness and in early warning systems. We describe the DCU ADAPT Centre participation in the microblog search data challenge at the SMERP 2017 w...

QUT ielab at CLEF 2017 e-Health IR Task: Knowledge Base Retrieval for Consumer Health Search

In this paper we describe our participation in the CLEF 2017 e-Health IR Task [6]. This track aims to evaluate and advance search technologies that support consumers in finding health advice online. Our solution addressed this challenge by developing a knowledge base (KB) query expansion method. We found that the two best KB query expansion methods are mapping entity mentions to KB entities...

Incorporating Query Expansion and Quality Indicators in Searching Microblog Posts

We propose a retrieval model for searching microblog posts for a given topic of interest. We develop a language modeling approach tailored to microblogging characteristics, where redundancy-based IR methods cannot be used in a straightforward manner. We enhance this model with two groups of quality indicators: textual and microblog specific. Additionally, we propose a dynamic query expansion mo...

Report on the CLEF-IP 2011 Experiments: Exploring Patent Summarization

This technical report presents the work carried out for the Prior Art Candidate Search track of CLEF-IP 2011. In this search scenario, the information need is expressed as a patent document (query topic). We compare two methods for estimating the query model from the patent document: summary-based query modeling and description-based query modeling. The former approach utilizes a known text su...

Publication date: 2017